Fast Scoring for PLDA with Uncertainty Propagation
Abstract
By treating utterances as points in the i-vector space, i-vector/PLDA can achieve fast verification. However, this approach cannot cope with utterance-length variability. A method called uncertainty propagation (UP), which takes the uncertainty of i-vectors into account, has recently been proposed to deal with this problem. However, the loading matrix for modeling utterance-length variability is session-dependent, which makes UP computationally expensive. In this paper, we demonstrate that utterance-length variability mainly affects the scale of the posterior covariance matrices. Based on this observation, we propose to replace the session-dependent loading matrices with ones trained from development data, where the selection of pre-computed loading matrices is based on a fast scalar comparison. This approach reduces the computational cost of standard UP to a level comparable with that of conventional PLDA. Experiments on the NIST 2012 Speaker Recognition Evaluation show that the proposed method performs as well as standard UP but requires only 3.7% of the scoring time. The method also requires substantially less memory than standard UP, especially when the number of target speakers is large.
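The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which scalar is compared or how the development covariances are grouped, so the sketch assumes the trace of the posterior covariance as the scalar, equal-sized groups, and one Cholesky factor of the group-mean covariance as the pre-computed loading matrix. The names `build_groups` and `select_loading` are hypothetical.

```python
import numpy as np

def build_groups(dev_covs, n_groups):
    """Group development posterior covariances by a scalar and store one
    loading matrix per group.

    Assumptions (not from the source): the scalar is the trace, groups are
    equal-sized bins of the sorted traces, n_groups <= len(dev_covs), and
    the loading matrix is the Cholesky factor of the group-mean covariance.
    """
    scalars = np.array([np.trace(c) for c in dev_covs])
    order = np.argsort(scalars)                # indices sorted by trace
    centers, loadings = [], []
    for idx in np.array_split(order, n_groups):
        mean_cov = np.mean([dev_covs[i] for i in idx], axis=0)
        centers.append(scalars[idx].mean())    # representative scalar of the group
        loadings.append(np.linalg.cholesky(mean_cov))  # U with U @ U.T == mean_cov
    return np.array(centers), loadings

def select_loading(cov, centers, loadings):
    """Pick the pre-computed loading matrix whose group scalar is closest
    to the scalar of the given session's posterior covariance."""
    k = int(np.argmin(np.abs(centers - np.trace(cov))))
    return k, loadings[k]
```

At scoring time only the scalar comparison in `select_loading` depends on the session, so any PLDA terms that involve the loading matrix could in principle be computed once per group rather than once per utterance.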
Similar resources
Fast scoring for PLDA with uncertainty propagation via i-vector grouping
The i-vector/PLDA framework has gained huge popularity in text-independent speaker verification. This approach, however, lacks the ability to represent the reliability of i-vectors. As a result, the framework performs poorly when presented with utterances of arbitrary duration. To address this problem, a method called uncertainty propagation (UP) was proposed to explicitly model the reliability...
Accounting for uncertainty of i-vectors in speaker recognition using uncertainty propagation and modified imputation
One of the biggest challenges in speaker recognition is incomplete observations in the test phase, caused by the availability of only short-duration utterances. The problem with short utterances is that speaker recognition must be performed with information from only a limited number of acoustic classes. By considering limited observations from a test speaker, the resulting i-vector as a representa...
Text-dependent speaker recognition using PLDA with uncertainty propagation
In this paper, we apply and enhance the i-vector-PLDA paradigm to text-dependent speaker recognition. Due to its origin in text-independent speaker recognition, this paradigm does not make use of the phonetic content of each utterance. Moreover, the uncertainty in the i-vector estimates should be taken into account in the PLDA model, due to the short duration of the utterances. To bridge this g...
I-Vector/PLDA Variants for Text-Dependent Speaker Recognition
The i-vector/PLDA approach currently dominates the field of text-independent speaker recognition and the question of how to translate this methodology to the text-dependent domain has recently become an active area of research. The essential difference between the two fields is that it is possible to do speaker recognition with enrollment and test utterances of very short duration in the text-d...
Fast variational Bayes for heavy-tailed PLDA applied to i-vectors and x-vectors
The standard state-of-the-art backend for text-independent speaker recognizers that use i-vectors or x-vectors is Gaussian PLDA (G-PLDA), assisted by a Gaussianization step involving length normalization. G-PLDA can be trained with either generative or discriminative methods. It has long been known that heavy-tailed PLDA (HT-PLDA), applied without length normalization, gives similar accuracy, bu...